United States Patent and Trademark Office 



UNITED STATES DEPARTMENT OF COMMERCE 
United States Patent and Trademark Office

Address: COMMISSIONER FOR PATENTS 



APPLICATION NO. FILING DATE FIRST NAMED INVENTOR ATTORNEY DOCKET NO. CONFIRMATION NO. 



MS305080.1/MSFTP475US 



27195 7590 09/04/2008 

AMIN. TUROCY & CALVIN, LLP 
24TH FLOOR, NATIONAL CITY CENTER 
1900 EAST NINTH STREET 
CLEVELAND, OH 44114 



JEAN GILLES, JUDE 



PAPER NUMBER 



NOTIFICATION DATE | DELIVERY MODE 
09/04/2008 ELECTRONIC 



Please find below and/or attached an Office communication concerning this application or proceeding. 



The time period for reply, if any, is set in the attached communication. 



Notice of the Office communication was sent electronically on above-indicated "Notification Date" to the 
following e-mail address(es): 

docket1@thepatentattorneys.com
hholmes@thepatentattorneys.com
lpasterchck@thepatentattorneys.com



PTOL-90A (Rev. 04/07) 



Commissioner for Patents 
United States Patent and Trademark Office 
P.O. Box 1450 
Alexandria, VA 22313-1450 

www.uspto.gov 



BEFORE THE BOARD OF PATENT APPEALS 
AND INTERFERENCES 



Application Number: 10/670,681 
Filing Date: September 25, 2003 
Appellant(s): BRILL ET AL. 



Himanshu S. Amin 
Reg. No. 40,894 
For Appellant 



EXAMINER'S ANSWER 



This is in response to the appeal brief filed 05/05/2008 appealing from the Office action 
mailed 12/31/2007. 

(1) Real Party in Interest 

A statement identifying by name the real party in interest is contained in the brief. 




(2) Related Appeals and Interferences

The examiner is not aware of any related appeals, interferences, or judicial 
proceedings which will directly affect or be directly affected by or have a bearing on the 
Board's decision in the pending appeal. 

(3) Status of Claims 

The statement of the status of claims contained in the brief is correct. 

(4) Status of Amendments After Final 

The appellant's statement of the status of amendments after final rejection 
contained in the brief is correct. 

(5) Summary of Claimed Subject Matter 

The summary of claimed subject matter contained in the brief is correct. 

(6) Grounds of Rejection to be Reviewed on Appeal 

The appellant's statement of the grounds of rejection to be reviewed on appeal is 
correct. 

(7) Claims Appendix 

The copy of the appealed claims contained in the Appendix to the brief is correct. 

(8) Evidence Relied Upon 

2006/0167864 Bailey Jul. 27, 2006 

2004/0240388 Albion Dec. 02, 2004 

(9) Grounds of Rejection 

The following ground(s) of rejection are applicable to the appealed claims: 
Claim Rejections - 35 USC § 103 



Application/Control Number: 10/670,681 
Art Unit: 2143 






2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

3. Claims 1-7, 9-21, 23-44, 46, 47, 49-65, 67-76, 78-92, 95-100, 102, 103, 105-112
and 114-116 are rejected under 35 U.S.C. 103(a) as being unpatentable over Bailey et
al. (hereinafter Bailey), US Pub. No. 20060167864 A1, in view of Albion et al.
(hereinafter Albion), US Pub. No. 20040240388 A1.

Regarding claim 1: Bailey discloses the invention substantially as claimed. Bailey 
teaches a data analysis system (fig. 1), comprising: 

a first component associated with a server of the data analysis system that 
facilitates generation of a first data set related to web page information obtained via a 
communication system (fig. 1, item 120; par. 0029); and 

a second component that coordinates a second data set relating to web page information
from at least one distributed resource which interacts with the communication system;
the second data set is utilized to refine the first data set (see abstract; fig. 1; note that
the web crawler (160) generates the data set through the Internet; 0037-0040, 0052).
However, Bailey does not specifically disclose a system wherein refining the first data
set comprises adding unknown information to the first data set when new information is
received from the distributed source via the second data set, or updating existing
information in the first data set when changes have occurred in the contents of the web
page information as indicated by the second data set. Nonetheless, this feature is well
known and would have been an obvious modification to the system shown by Bailey, as
evidenced by Albion.

In an analogous art, Albion shows a plurality of clients capable of sending update
information to a crawler; this request information is the second data, which updates the
timer list with new data (see Albion, par. 0020-0022). In an attempt to update the web
crawler with new information from the distributed source, change information is passed
from the clients to the crawler with updated timer list contents.

Given this feature, a person of ordinary skill in the art would have readily
recognized the desirability and advantages of modifying the system shown by Bailey to
employ the features shown by Albion in order to facilitate dynamic content updates
(see Albion, par. 0001, 0011). In referring to 2, Albion shows a crawler receiving timer
information from a client and updating its timer list in memory with the new information.
By this rationale, claim 1 is rejected.

Regarding claims 1-7, 9-21, 23-44, 46, 47, 49-65, 67-76, 78-92, 95-100, 102, 103,
105-112 and 114-116, the combination Bailey-Albion teaches:

2. The system of claim 1, the first component comprising an internet web crawler (see
Bailey; 120).

3. The system of claim 1, the first component comprising an intranet web crawler (see
Bailey; 120; the crawler is usable equally in the Internet, as well as an Intranet).

4. The system of claim 1, the second component further utilized to optimize reception of
data from the distributed resources (see Bailey; 164).

5. The system of claim 1, the second component provides a scheduling function to
control reception of the second data set from the at least one distributed resource (see
Bailey; 147).

6. The system of claim 1, the second component utilized to facilitate communication
traffic reduction via the communication system by employing a proper set of weak
indicator functions representative of the first data set (see Bailey; 162).

7. The system of claim 6, the second component further utilized to randomly select and
transmit a weak indicator function selected from the proper set of weak indicator
functions to at least one of the distributed resources (see Bailey; 160, 162, 164).

9. The system of claim 1, the second component further utilized to generate status
information about data related to the first data set; the status information transmitted to
at least one distributed resource (see Bailey; fig. 5; 0070).

10. The system of claim 9, the status information comprising, at least in part, a
freshness flag to indicate freshness of information related to the first data set (see
Bailey; fig. 5; 0070).

11. The system of claim 9, the status information comprising, at least in part, a hash of
contents of information related to the first data set (see Bailey; fig. 5; 0070, 0076).

12. The system of claim 9, the status information comprising, at least in part, a copy of
information of the first data set (see Bailey; fig. 5; 0070, 0076).

13. The system of claim 1, the communication system comprising an internet (see
Bailey; 110, 120, 130).

14. The system of claim 1, the communication system comprising a world wide web
(see Bailey; 110, 120, 130).

15. The system of claim 1, the communication system comprising an intranet (see
Bailey; 110, 120, 130).

16. The system of claim 15, the intranet comprising a local area network (see Bailey;
130).

17. The system of claim 15, the intranet comprising a wide area network (see Bailey;
110, 120, 130).

18. The system of claim 1, the distributed resources comprising clients of a server (see
Bailey; 110, 120, 130).

19. The system of claim 1, the distributed resources comprising trusted entities
interactive with the communication system and the second component (see Bailey; fig.
2, 5).

20. The system of claim 1, the first data set comprising internet web page data (see
Bailey; 0043, 0070, 0087; fig. 1 & 2).

21. The system of claim 1, the first data set comprising intranet web page data (see
Bailey; 0043, 0070, 0087; fig. 1 & 2).

23. The system of claim 1, the second data set comprising, at least in part, a hash of
contents of at least one web page (see Bailey; 0040, 0070, 0087; fig. 1, 2, & 5).

24. The system of claim 1, the second data set comprising, at least in part, a Uniform
Resource Locator (URL) of at least one web page (see Bailey; 0040, 0070, 0087; fig. 1,

25. The system of claim 1, the second data set comprising, at least in part, a time stamp
relating to an acquisition time for information about at least one web page (see Bailey;
0043, 0070, 0087; fig. 1 & 2).

26. The system of claim 1, the second data set comprising, at least in part, a delta
indication of the changes to contents of at least one web page (see Bailey; 0043, 0070,
0087; fig. 1 & 2).

27. The system of claim 26, the delta indication including, at least in part, a hash of
previous contents of a web page and a hash of recent contents of the web page (see
Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

28. The system of claim 1, the second data set comprising, at least in part, a status
indication of changes to contents of at least one web page (see Bailey; 0028, 0043,
0070, 0087; 0076, fig. 1 & 2).

29. The system of claim 28, the status indication including, at least in part, a percentage
relating to an amount of change of contents of a web page (see Bailey; 0028, 0043,
0070, 0087; 0076, fig. 1 & 2).

30. The system of claim 28, the status indication including, at least in part, a
significance indicator to signify importance of changes in contents of a web page (see
Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

31. The system of claim 1, the second data set comprising internet web page data (see
Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

32. The system of claim 1, the second data set comprising intranet web page data (see
Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

33. The system of claim 1, the second data set comprising data compiled utilizing at
least one weak indicator function randomly selected from a set of weak indicator
functions; the set of weak indicator functions representative of the first data set (see
Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

34. The system of claim 1, further comprising a search component to accept at least
one search query and generate at least one search reply having at least a portion of the
first data set represented by information embedded in the search reply (see Bailey;
0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

35. The system of claim 1, further comprising a web page server component to
construct web pages having at least a portion of the first data set represented by
information embedded in at least one link found on at least one constructed web page
(see Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

36. The system of claim 1, further comprising a storage component to store the first data
set (see Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

37. A method for facilitating data analysis, comprising:

generating a first data set relating to a second data set obtained from web pages
interactive with a server of a communication system (see Bailey; see abstract; fig. 1;
0037-0040, 0052);

receiving a third data set from at least one distributed resource that is interactive with
the communication system; the third data set comprising web page related information
generated by the distributed resource; and refining the second data set to reflect
information obtained from the third data set (see Bailey; 0084-0088);

adding unknown information to the second data set when new information is received
from the distributed source via the third data set;

updating existing information in the second data set when changes have occurred as
indicated by the third data set; and

passing status information to the distributed resource through one or more indicators
after information from the third data set has been analyzed (see Albion, par.
0020-0022).

38. The method of claim 37, the first data set comprising a representation of the second
data set (see Bailey; see abstract; fig. 1; 0037-0040, 0052).

39. The method of claim 38, the representation of the second data set comprising, at
least in part, a hash of contents of at least one web page contained in the second data
set (see Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

40. The method of claim 38, the representation of the second data set comprising, at
least in part, a status indication of at least one web page contained in the second data
set (see Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

41. The method of claim 40, the status indication comprising a freshness flag to indicate
if the web page information is current (see Bailey; fig. 5; 0070).

42. The method of claim 37, the first data set comprising a copy of the second data set
(see Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

43. The method of claim 37, the second data set comprising web page information
compiled by a web crawler (see Bailey; 0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

44. The method of claim 37, the third data set comprising web page information based
upon client accessed web page information on the communication system (see Bailey;
0028, 0043, 0070, 0087; 0076, fig. 1 & 2).

46. The method of claim 37, the communication system comprising an internet (see 
Bailey; fig. 1). 

47. The method of claim 37, the communication system comprising an intranet (see 
Bailey; fig. 1). 

49. The method of claim 37, further including: transmitting the first data set to at least
one distributed resource that is interactive with the communication system making the
first data set available to be utilized by the distributed resource to generate the third
data set (see Bailey; 0028, 0084-0088; 0037-0043, 0070, 0087; 0076, fig. 1 & 2).

50. The method of claim 38, further including: generating a set of weak indicator
functions to represent the second data set; and selecting random weak indicator
functions from the set of weak indicator functions to transmit to the distributed resources
as the first data set (see Bailey; 0028, 0084-0088; 0037-0043, 0070, 0087; 0076, fig. 1
& 2).

51. The method of claim 50, the set of weak indicator functions comprising a proper set
of weak indicator functions such that a non-zero probability exists that a randomly
selected weak indicator function can identify a new web page (see Bailey; 0028,
0084-0088; 0037-0043, 0070, 0087; 0076, fig. 1 & 2).

52. The method of claim 50, generating a set of weak indicator functions comprising:
providing a dictionary representative of the second data set; partitioning randomly the
dictionary into non-overlapping subdictionaries; and creating a function where I(x)=1 if
and only if at least one subdictionary's weak indicator function is equal to one (see
Bailey; 0076-0080).

53. The method of claim 37, further including: comparing the third data set to the
second data set to reveal spoof data included in the second data set (see Bailey; 0028,
0084-0088; 0037-0043, 0070, 0087; 0076, fig. 1 & 2).

54. The method of claim 37, further including: optimizing reception of at least one third
data set through scheduling of the distributed resources (see Bailey; 0028, 0084-0088;
0037-0043, 0070, 0087; 0076, fig. 1 & 2).

55. The method of claim 37, further including: receiving a web page search query from
at least one distributed resource; generating a web search results page in response to
the web page search query from the distributed resource; embedding portions of the
first data set in links found on the web search results page; and transmitting the web
search results page as a representation of at least a portion of the second data set to
the distributed resource (see Bailey; 0028, 0084-0088; 0037-0043, 0070, 0087; 0076,
fig. 1 & 2).

56. The method of claim 37, further including: constructing a web page utilizing at least
a portion of the first data set to embed information about links found in the web page;
and transmitting the web page to disseminate the first data set to at least one distributed
resource (see Bailey; 0028, 0084-0088; 0037-0043, 0070, 0087; 0076, fig. 1 & 2).

57. A data analysis system, comprising:

means for generating at least one first data set from a server of communication system;
means for receiving and coordinating at least one second data set from at least one
distributed resource client which interacts with the server of the communication system;
and means for refining the first data set utilizing at least one second data set (see
Bailey; 0028, 0084-0088; 0037-0043, 0070, 0087; 0076, fig. 1 & 2);
wherein refining the first data set comprises at least one of adding unknown
information to the first data set when new information is received from the client via the
second data set and updating existing information in the first data set when changes
have occurred in the web page as indicated by the second data set (see Albion, par.
0020-0022).

61. A data analysis system, comprising: a first component associated with at least one
client of a distributed web crawling system
that generates web page information from at least one visited web site for utilization in
[a] the distributed web crawling system; the web page information transmitted by the
first component to a second component; and (see Bailey; 0028, 0084-0088; 0037-0043,
0070, 0087; 0076, fig. 1 & 2)

a second component associated with a server that receives the web page information
transmitted by the first component via a communication system, wherein the first
component receives a set of data from the second component to utilize in the
generation of the web page information comprising at least comparison data based on
the visited web page and the received set of data (see Albion, par. 0020-0022).

92. A method for facilitating data analysis, comprising: compiling a first data set derived 
from accessing web pages via a client of a communication system; transmitting, 
selectively, the first data set to an entity comprising at least a server of a distributed 
crawling system that is interactive with the communication system (see Bailey; 0028, 
0084-0088; 0037-0043, 0070, 0087; 0076, fig. 1 & 2). 

receiving a representation of a second data set compiled by the server of the web 
crawler; 

the second data set relating to at least one web page from the communication system; 
and utilizing the second data set to control which web pages to visit to compile the first 
data set (see Albion, par. 0020-0022).

114. (Currently Amended) A computer readable medium having stored thereon
computer executable components comprising: 

a first component associated with a server of the data analysis system that 
facilitates generation of a first data set related to web page information obtained via a 
communication system; and a second component that coordinates a second data set 
relating to web page information from at least one distributed resource associated with 
at least a client of the server which interacts with the communication system; (see 
Bailey; 0028, 0084-0088; 0037-0043, 0070, 0087; 0076, fig. 1 & 2) 

the second data set is utilized to refine the first data set, wherein refining the first data 
set comprises adding unknown information to the first data set when new information is 
received from the distributed source via the second data set and updating existing 
information in the first data set when changes have occurred in the contents of the web 
page information as indicated by the second data set (see Albion, par. 0020-0022).

Claims 58-60, 62-65, 67-76, 78-92, 95-100, 102, 103, 105-112, and 115-116 are
similar to other claims addressed above (see rejection of claims 2-56 above).

(10) Response to Arguments 

Appellants request a reversal of the rejection of claims 1-7, 9-21, 23-44, 46, 47, 49-65, 
67-76, 78-92, 95-100, 102, 103, 105-112 and 114-116, rejected as unpatentable under
35 U.S.C. § 103(a) over Bailey et al. (U.S. 20060167864) in view of Albion et al. (U.S.
20040240388). Appellants contend that Bailey et al. and Albion et al., alone or in
combination fail to teach or suggest all features set forth in the subject claims. The 
reasons for requesting the reversal of this rejection are addressed below. 

Issue 1) Appellants contend the cited document is silent regarding utilizing web page
information communicated by a client of the distributed web crawler system to update its
original crawler web page data to reflect a new web page or change of contents in a
known web page. For example, Bailey et al. teaches a conventional web crawler
implemented by a server but does not teach or suggest that the web crawler 160 is
updated with inputs from the clients 110 (see Bailey et al., Fig. 1 and paragraph [0037]).
Thus Bailey et al. does not disclose a distributed web crawler wherein a client updates
web pages associated with a server of the distributed system as recited by the subject
claims.

Issue 1 response: The Examiner disagrees with the Appellants' assertion with regard to
the point of contention related to issue 1. It is the position of the Examiner that both
Bailey and Albion disclose the step of "utilizing web page information
communicated by a client of the distributed web crawler system to update its original
crawler web page data to reflect a new web page or change of contents in a known web
page". Bailey teaches a web crawler that is initially evaluated according to a set of
content-based rules to generate a score that indicates a likelihood that the web page
includes a product offering or content of other pages from the same site (see Bailey par.
0011). Note that it is well known in the art that those other pages can be submitted by a
client or another server submitting web pages containing product offerings relevant to a
user's search. In fig. 5 of Bailey, database 147 is refreshed frequently, updating the
original crawler web page data and thereby reflecting new web pages with new content
offerings gathered from other client or server systems (see Bailey, par. 0042 and 0070).

Issue 2) Returning to issue 1 above, the Appellants conclude that Albion et al. is silent
regarding refining the second data set to reflect
information obtained from the third data set by adding unknown information to the
second data set when new information is received from the distributed source via the
third data set as recited by the subject claims.

Issue 2 response: The Examiner disagrees with this assertion. In addition to the 
teachings of Bailey, Albion discloses in par. 0018 "the crawler 204 performs various 
functions including the processing of client requests, detection of timeouts of the 
timers, notification of timeouts, and management of the list of available timers 208. 
Client requests may include a request to allocate an available timer to a specified 
connection, a request to de-allocate a timer, a request to start a timer, or a request to 
204 maintains the list of available timers 208, including timers that have not 
been allocated or timers that have timed out". Client requests and/or timeout timers are
updated in crawler 204, updating its original contents. By this rationale, the rejection of
claims 1-7, 9-21, 23-44, 46, 47, 49-65, 67-76, 78-92, 95-100, 102, 103, 105-112 and
114-116 is sustained.

For the above reasons, it is believed that the rejections should be sustained. 

(11) Related Proceeding(s) Appendix

No decision rendered by a court or the Board is identified by the examiner in the 
Related Appeals and Interferences section of this examiner's answer. 

Any inquiry concerning this communication or earlier communications from examiner 
should be directed to Jude Jean-Gilles whose telephone number is (571) 272-3914. 
The examiner can normally be reached on Monday-Thursday and every other Friday 
from 8:00 AM to 5:30 PM. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Tonia Dollinger, can be reached on (571) 272-4170. The fax phone number 
for the organization where this application or proceeding is assigned is (571) 273-3301.

Any inquiry of a general nature or relating to the status of this application or 
proceeding should be directed to the receptionist whose telephone number is
(571) 272-0800.

Respectfully submitted, 

/Jude J Jean-Gilles/ 

Primary Examiner, Art Unit 2143
August 22, 2008



/Tonia LM Dollinger/ 

Supervisory Patent Examiner, Art Unit 2143 



/Nathan J. Flynn/ 

Supervisory Patent Examiner, Art Unit 2154 



